59 research outputs found

    Approximate NN Queries on Streams with Guaranteed Error/performance Bounds

    Flexible String Matching Against Large Databases in Practice

    Small domain randomization

    Categorical skylines for streaming data

    DOI: 10.1145/1376616.1376643. Proceedings of the ACM SIGMOD International Conference on Management of Data, pp. 239-25

    Declustering R-Tree on Multi-Computer Architectures

    We study a method to decluster a spatial access method (and specifically an R-tree) on a shared-nothing multi-computer architecture [9]. Our first step is to propose a software architecture, with the top levels of the R-tree on the 'master server' and the leaf nodes distributed across the servers. Next, we study the optimal capacity of leaf nodes, or 'chunk size'. We express the response time on range queries as a function of the 'chunk size', and we show how to optimize it. This formula assumes that the 'chunks' are perfectly declustered. We propose to use the Hilbert curve to achieve such a good declustering. Finally, we implemented our method on a network of workstations and we compared the experimental and the theoretical results. The conclusions are that (a) our formula for the response time is accurate (the maximum relative error was 30%; the typical error was in the vicinity of 10-15%); (b) the Hilbert-based declustering consistently outperforms a random declustering; and (c) most importantly, although the optimal chunk size depends on several factors (database size, size of the query, speed of the network), a safe choice for it is 1 page (whatever the page size of the operating system is). We show analytically and experimentally that a chunk size of 1 page gives either optimal or close to optimal results, for a wide range of the parameters.
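    The abstract names the Hilbert curve as the declustering device but does not state the assignment rule. The following is a minimal sketch of one plausible reading, not the authors' implementation: leaf chunks are ordered by the Hilbert index of a representative grid cell and then dealt to servers round-robin, so that chunks close together in space tend to land on different servers. The function names, the grid `order` parameter, and the round-robin rule are all assumptions introduced for illustration.

```python
def hilbert_index(order, x, y):
    """Position of grid cell (x, y) along a Hilbert curve covering a
    2**order by 2**order grid (standard bit-manipulation formulation)."""
    n = 1 << order
    d = 0
    s = n >> 1
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the sub-curve is oriented correctly.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s >>= 1
    return d


def decluster(chunk_cells, num_servers, order):
    """Assign each chunk to a server (hypothetical helper).

    chunk_cells: list of (x, y) integer grid cells, one per leaf chunk,
                 e.g. the cell containing the chunk's bounding-box center.
    Returns a list where entry i is the server id for chunk i.
    Assumption: round-robin along the Hilbert order.
    """
    ranked = sorted(
        range(len(chunk_cells)),
        key=lambda i: hilbert_index(order, *chunk_cells[i]),
    )
    assignment = [0] * len(chunk_cells)
    for rank, i in enumerate(ranked):
        assignment[i] = rank % num_servers
    return assignment


if __name__ == "__main__":
    cells = [(x, y) for x in range(4) for y in range(4)]
    for cell, server in zip(cells, decluster(cells, num_servers=4, order=2)):
        print(cell, "-> server", server)
```

    Under this rule, any run of consecutive chunks along the curve no longer than the number of servers is spread over distinct servers, which is the property a range query benefits from when it touches a compact region.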

    Relaxing join and selection queries

    VLDB 2006 - Proceedings of the 32nd International Conference on Very Large Data Bases, pp. 199-21